Geospatial
Region Name simd Geometry Falkirk 5 instruction(long_1, lat_1,…) Glasgow City 6 instruction(long_g1, lat_g1,…)
Simple Features Each row in a spatial dataframe/tibble is a feature Similar to how we used a specifc package to work with time series data, we are going to use s specific packahe to work with geospatial data
install.packages(c("sf", "rgeos","rnaturalearth", "rnaturalearthdata"))
There is a binary version available but the source version is later:
no
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.2/sf_1.0-7.tgz'
Content type 'application/x-gzip' length 88883512 bytes (84.8 MB)
==================================================
downloaded 84.8 MB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.2/rgeos_0.5-9.tgz'
Content type 'application/x-gzip' length 1609563 bytes (1.5 MB)
==================================================
downloaded 1.5 MB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.2/rnaturalearth_0.1.0.tgz'
Content type 'application/x-gzip' length 220835 bytes (215 KB)
==================================================
downloaded 215 KB
trying URL 'https://cran.rstudio.com/bin/macosx/contrib/4.2/rnaturalearthdata_0.1.0.tgz'
Content type 'application/x-gzip' length 3234794 bytes (3.1 MB)
==================================================
downloaded 3.1 MB
The downloaded binary packages are in
/var/folders/h4/vy1nwyvd5kz_n6950nl29pj80000gn/T//RtmpGopTXG/downloaded_packages
library(sf)
Linking to GEOS 3.10.2, GDAL 3.4.2, PROJ 8.2.1; sf_use_s2() is TRUE
simple feature simple mean don’t self intersect
north_carolina <- st_read(system.file("shape/nc.shp", package = "sf"))
Reading layer `nc' from data source
`/Library/Frameworks/R.framework/Versions/4.2/Resources/library/sf/shape/nc.shp'
using driver `ESRI Shapefile'
Simple feature collection with 100 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
Geodetic CRS: NAD27
plot(north_carolina['AREA'])
nrow(north_carolina)
[1] 100
#100 features
names(north_carolina)
[1] "AREA" "PERIMETER" "CNTY_" "CNTY_ID" "NAME" "FIPS" "FIPSNO"
[8] "CRESS_ID" "BIR74" "SID74" "NWBIR74" "BIR79" "SID79" "NWBIR79"
[15] "geometry"
#how to draw the first feature
nc_geo[[1]]
MULTIPOLYGON (((-81.47276 36.23436, -81.54084 36.27251, -81.56198 36.27359, -81.63306 36.34069, -81.74107 36.39178, -81.69828 36.47178, -81.7028 36.51934, -81.67 36.58965, -81.3453 36.57286, -81.34754 36.53791, -81.32478 36.51368, -81.31332 36.4807, -81.26624 36.43721, -81.26284 36.40504, -81.24069 36.37942, -81.23989 36.36536, -81.26424 36.35241, -81.32899 36.3635, -81.36137 36.35316, -81.36569 36.33905, -81.35413 36.29972, -81.36745 36.2787, -81.40639 36.28505, -81.41233 36.26729, -81.43104 36.26072, -81.45289 36.23959, -81.47276 36.23436)))
The big seven (shapes that a feature can be)
1.point(1 1) 2.multipoint ((1 1), (2,2)) 3.line (1 1, 1 2) 4.multi line ((1 1, 1 2), (2 2, 3 3)) 5.polygon (1 1, 1 2, 2 2, 2 1, 1 1) 6.multi polygon ((1 1, 1 2, 2 2, 2 1, 1 1)) 7.geomtry collection (point( 1 1 ), mutiline((…)))
geomtry collection can be made up of multiple shape instruction, and any number of tem, just not a geometry collection
shetland would be a multipolygon made up of several island polygon
plot(nc_geo[[1]])
library(tidyverse)
Registered S3 methods overwritten by 'dbplyr':
method from
print.tbl_lazy
print.tbl_sql
── Attaching packages ───────────────────────────────────────────────── tidyverse 1.3.2 ──✔ tibble 3.1.8 ✔ dplyr 1.0.9
✔ tidyr 1.2.0 ✔ stringr 1.4.0
✔ readr 2.1.2 ✔ forcats 0.5.1
✔ purrr 0.3.4 ── Conflicts ──────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
geom_sf
ggplot(north_carolina)+
geom_sf(aes(fill = AREA), size = 0.1, colour = "black")+
theme_bw()
class(north_carolina)
[1] "sf" "data.frame"
ggplot(north_carolina)+
geom_sf(aes(fill = BIR79 ), size = 0.1, colour = "black")+
theme_bw()
ggplot(north_carolina)+
geom_sf()+
geom_sf(data = north_carolina %>% filter(AREA == max(AREA)), fill = "red")
tsibble –> as_tibble –> tibble sf df –> st_drop_geometry –> df(tibble)
this is about thw world
##subsetting simple feature
library(rnaturalearth)
library(rnaturalearthdata)
world<- ne_countries(scale = "medium",returnclass = "sf")
241 features of 64 variable
world %>%
st_drop_geometry() %>%
head()
ggplot(world) +
geom_sf(aes(fill = pop_est), size = 0.1)+
scale_fill_viridis_c(trans = "sqrt")
NA
one thing we can do to “better” visualise the different populations between countries is …> apply a transformation
(indian and chian still most popu countires, but because we reduce how much more populated they are in )
ggplot(world) +
geom_sf(aes(fill = gdp_md_est), size = 0.1)+
scale_fill_viridis_c(trans = "sqrt")
picking out individual coutries
world %>%
filter(name =="Italy") %>%
ggplot()+
geom_sf()
zooming in
world %>%
ggplot() +
geom_sf(aes(fill = economy), size = 0.1) +
coord_sf(xlim = c(-102.15, -74.12), ylim = c(7.65, 33.9), expand = FALSE)+
geom_sf_text(aes(label = name), size = 2, check_overlap = TRUE)+
scale_fill_brewer(palette = "Dark2") +
ggtitle("Economic Rating") +
theme(legend.position = "bottom")
creating interactive geospatial data visulisation
library(leaflet)
leaflet() %>%
addTiles() %>%
addMarkers(
lng = 174.768, lat = -36.852, popup = "Birthplace of R"
)
circle markers
library(jsonlite)
Attaching package: ‘jsonlite’
The following object is masked from ‘package:purrr’:
flatten
data_url <- "https://data.colorado.gov/resource/j5pc-4t32.json?&county=BOULDER"
water_data <- fromJSON(data_url) %>%
jsonlite::flatten(recursive = TRUE)
data prep amount, latitude and longtitude are not numeric but they should be
water_data_clean<-water_data %>%
mutate(across(
c(amount, location.latitude, location.longitude),
as.numeric
)) %>%
filter(!is.na(location.latitude))
leaflet(water_data_clean) %>%
addTiles() %>%
addCircles(
lng = ~location.longitude,
lat = ~location.latitude,
radius = ~amount/10,
#size depend on water amount
popup = ~paste("water:", amount, sep = " ")
)
leaflet(water_data_clean) %>%
addTiles() %>%
addMarkers(
lng = ~location.longitude,
lat = ~location.latitude,
popup = ~paste("water:", amount, sep = " "),
clusterOptions = markerClusterOptions()
)
plot(nc_geo[[4]])
plot(st_simplify(nc_geo, dTolerance = 2500)[[4]])
simlpify increasing tolerance will decrease the number of points
by simlifying we can reduce computational runtime
reading/writing shapefiles
#not standerd to read in a system file, mostly for example
st_read(system_file())
north_carolina %>%
st_simplify(dTolerance = 2000) %>%
st_write(dsn = "nc_simp", layer = "simp_nc", driver = "ESRI Shapefile")
Writing layer `simp_nc' to data source `nc_simp' using driver `ESRI Shapefile'
Writing 100 features with 14 fields and geometry type Multi Polygon.
read in
nc_simp
Simple feature collection with 100 features and 14 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
Geodetic CRS: NAD27
First 10 features:
AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74 SID74 NWBIR74
1 0.114 1.442 1825 1825 Ashe 37009 37009 5 1091 1 10
2 0.061 1.231 1827 1827 Alleghany 37005 37005 3 487 0 10
3 0.143 1.630 1828 1828 Surry 37171 37171 86 3188 5 208
4 0.070 2.968 1831 1831 Currituck 37053 37053 27 508 1 123
5 0.153 2.206 1832 1832 Northampton 37131 37131 66 1421 9 1066
6 0.097 1.670 1833 1833 Hertford 37091 37091 46 1452 7 954
7 0.062 1.547 1834 1834 Camden 37029 37029 15 286 0 115
8 0.091 1.284 1835 1835 Gates 37073 37073 37 420 0 254
9 0.118 1.421 1836 1836 Warren 37185 37185 93 968 4 748
10 0.124 1.428 1837 1837 Stokes 37169 37169 85 1612 1 160
BIR79 SID79 NWBIR79 geometry
1 1364 0 19 MULTIPOLYGON (((-81.45289 3...
2 542 3 12 MULTIPOLYGON (((-81.23989 3...
3 3616 6 260 MULTIPOLYGON (((-80.45634 3...
4 830 2 145 MULTIPOLYGON (((-75.94193 3...
5 1606 3 1197 MULTIPOLYGON (((-77.23461 3...
6 1838 5 1237 MULTIPOLYGON (((-76.7075 36...
7 350 2 139 MULTIPOLYGON (((-75.95718 3...
8 594 2 371 MULTIPOLYGON (((-76.46035 3...
9 1190 2 844 MULTIPOLYGON (((-78.30876 3...
10 2038 5 176 MULTIPOLYGON (((-80.02406 3...
pal <- colorBin("Purples", nc_simp$BIR74, n =5)
labels <- paste0(nc_simp$NAME, nc_simp$BIR74)
leaflet(nc_simp) %>%
addTiles() %>%
addPolygons(weight = 1, color = "black", fillColor = ~pal(BIR74),
fillOpacity = 100,
label = labels)
Warning: sf layer has inconsistent datum (+proj=longlat +datum=NAD27 +no_defs).
Need '+proj=longlat +datum=WGS84'
%>% addLegend(pal = pal, values = ~BIR74 colours = ~pal(BIR74), position = “bottomright” )